TGF-beta induced gene family.

A new gene family induced by TGF-beta is disclosed. Two new genes, designated βIG-M1
and βIG-M2, are induced in response to TGF-β1 treatment of mouse embryo fibroblasts.
These genes encode proteins containing about 345 to about 380 amino acid residues,
with a molecular weight of about 37,000 to about 48,000 daltons and about 38 cysteine
residues. The induced proteins share about 50% homology with each other and significant
homology with a v-src induced protein in chicken embryo fibroblasts designated CEF-10.
These proteins may be involved in producing some of the growth and differentiation
modulating effects of TGF-β1.
TECHNICAL FIELD OF THE INVENTION

The present invention is directed to the induction of a new gene family in response to TGF-beta
administration to target cells in culture. Two specifically induced genes were isolated and characterized.

BACKGROUND OF THE INVENTION 

Transforming growth factor-β1 (TGF-β1) is a multifunctional regulator of cell growth and differentiation. It
is capable of causing diverse effects such as inhibition of the growth of monkey kidney cells (Tucker, R.F.,
G.D. Shipley, H.L. Moses & R.W. Holley (1984) Science 226:705-707), inhibition of growth of several human
cancer cell lines (Roberts, A.B., M.A. Anzano, L.M. Wakefield, N.S. Roche, D.F. Stern & M.B. Sporn (1985)
Proc. Natl. Acad. Sci. USA 82:119-123; Ranchalis, J.E., L.E. Gentry, Y. Ogawa, S.M. Seyedin, J. McPherson,
A. Purchio & D.R. Twardzik (1987) Biochem. Biophys. Res. Commun. 148:783-789), inhibition of mouse
keratinocytes (Coffey, R.J., N.J. Sipes, C.C. Bascom, R. Graves-Deal, C. Pennington, B.E. Weissman & H.L. Moses
(1988) Cancer Res. 48:1596-1602; Reiss, M. & C.L. Dibble (1988) In Vitro Cell. Dev. Biol. 24:537-544),
stimulation of growth of AKR-2B fibroblasts (Tucker, R.F., M.E. Volkenant, E.L. Branum & H.L. Moses (1988) Cancer
Res. 43:1581-1586) and normal rat kidney fibroblasts (Roberts, A.B., M.A. Anzano, L.C. Lamb, J.M. Smith &
M.B. Sporn (1981) Proc. Natl. Acad. Sci. USA 78:5339-5343), stimulation of synthesis and secretion of
fibronectin and collagen (Ignotz, R.A. & J. Massague (1986) J. Biol. Chem. 261:4337-4345; Centrella, M., T.L.
McCarthy & E. Canalis (1987) J. Biol. Chem. 262:2869-2874), induction of cartilage-specific macromolecule
production in muscle mesenchymal cells (Seyedin, S.M., A.Y. Thompson, H. Bentz, D.M. Rosen, J. McPherson,
A. Contin, N.R. Siegel, G.R. Galluppi & K.A. Piez (1986) J. Biol. Chem. 261:5693-5695) and growth
inhibition of T and B lymphocytes (Kehrl, J.H., L.M. Wakefield, A.B. Roberts, S. Jakowlew, M. Alvarez-Mon, R.
Derynck, M.B. Sporn & A.S. Fauci (1986) J. Exp. Med. 163:1037-1050; Kehrl, J.H., A.B. Roberts, L.M.
Wakefield, S. Jakowlew, M.B. Sporn & A.S. Fauci (1987) J. Immunol. 137:3855-3860; Kasid, A., G.I. Bell & E.P.
Director (1988) J. Immunol. 141:690-698; Wahl, S.M., D.A. Hunt, H.L. Wong, S. Dougherty, N. McCartney-Francis,
L.M. Wahl, L. Ellingsworth, J.A. Schmidt, G. Hall, A.B. Roberts & M.B. Sporn (1988) J. Immunol.
140:3026-3032).

Recent investigations have indicated that TGF-β1 is a member of a family of closely related growth-modulating
proteins including TGF-β2 (Seyedin, S.M., P.R. Segarini, D.M. Rosen, A.Y. Thompson, H. Bentz & J.
Graycar (1987) J. Biol. Chem. 262:1946-1949; Cheifetz, S., J.A. Weatherbee, M.L.-S. Tsang, J.K. Anderson,
J.E. Mole, R. Lucas & J. Massague (1987) Cell 48:409-415; Ikeda, T., M.N. Lioubin & H. Marquardt (1987)
Biochemistry 26:2406-2410), TGF-β3 (Ten Dijke, P., P. Hansen, K. Iwata, C. Pieler & J.G. Foulkes (1988) Proc.
Natl. Acad. Sci. USA 85:4715-4719; Derynck, R., P. Lindquist, A. Lee, D. Wen, J. Tamm, J.L. Graycar, L. Rhee,
A.J. Mason, D.A. Miller, R.J. Coffey, H.L. Moses & E.Y. Chen (1988) EMBO J. 7:3737-3743; Jakowlew, S.B.,
P.J. Dillard, P. Kondaiah, M.B. Sporn & A.B. Roberts (1988) Mol. Endocrinology 2:747-755), TGF-β4
(Jakowlew, S.B., P.J. Dillard, M.B. Sporn & A.B. Roberts (1988) Mol. Endocrinology 2:1186-1195), Mullerian
inhibitory substance (Cate, R.L., R.J. Mattaliano, C. Hession, R. Tizard, N.M. Farber, A. Cheung, E.G. Ninfa,
A.Z. Frey, D.J. Gash, E.P. Chow, R.A. Fisher, J.M. Bertonis, G. Torres, B.P. Wallner, K.L. Ramachandran,
R.C. Ragin, T.F. Manganaro, D.T. MacLaughlin & P.K. Donahoe (1986) Cell 45:685-698) and the inhibins
(Mason, A.J., J.S. Hayflick, N. Ling, F. Esch, N. Ueno, S.-Y. Ying, R. Guillemin, H. Niall & P.H. Seeburg
(1985) Nature 318:659-663).

TGF-β1 is a 24-kDa protein consisting of two identical disulfide-bonded 12 kD subunits (Assoian, R.K., A.
Komoriya, C.A. Meyers, D.M. Miller & M.B. Sporn (1983) J. Biol. Chem. 258:7155-7160; Frolik, C.A., L.L. Dart,
C.A. Meyers, D.M. Miller & M.B. Sporn (1983) Proc. Natl. Acad. Sci. USA 80:3676-3680; Frolik, C.A., L.M.
Wakefield, D.M. Smith & M.B. Sporn (1984) J. Biol. Chem. 259:10995-11000). Analysis of cDNA clones coding
for human (Derynck, R., J.A. Jarrett, E.Y. Chen, D.H. Eaton, J.R. Bell, R.K. Assoian, A.B. Roberts, M.B. Sporn
& D.V. Goeddel (1985) Nature 316:701-705), murine (Derynck, R., J.A. Jarrett, E.Y. Chen & D.V. Goeddel
(1986) J. Biol. Chem. 261:4377-4379) and simian (Sharples, K., G.D. Plowman, T.M. Rose, D.R. Twardzik &
A.F. Purchio (1987) DNA 6:239-244) TGF-β1 indicates that this protein is synthesized as a larger 390 amino
acid pre-pro-TGF-β1 precursor; the carboxyl terminal 112 amino acid portion is then proteolytically cleaved to
yield the TGF-β1 monomer.
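
As a worked check of the precursor figures quoted above, the following minimal Python sketch derives the residue range occupied by a carboxyl terminal mature peptide within its precursor. The function name `mature_region` is a hypothetical helper introduced only for illustration; the only inputs taken from the text are the 390 residue precursor and the 112 residue mature portion.

```python
def mature_region(precursor_len, mature_len):
    """Return the 1-based (first, last) residues of a C-terminal mature peptide."""
    if not 0 < mature_len <= precursor_len:
        raise ValueError("mature peptide must fit inside the precursor")
    first = precursor_len - mature_len + 1
    return first, precursor_len


# Values stated in the text: 390-residue precursor, 112-residue mature monomer.
first, last = mature_region(390, 112)
print(first, last)  # 279 390
```

Under these numbers the mature monomer would correspond to residues 279 to 390 of the precursor, and the remaining 278 residues to the signal and pro-region.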

The simian TGF-β1 cDNA clone has been expressed to high levels in Chinese hamster ovary (CHO) cells.
Analysis of the proteins secreted by these cells using site-specific antipeptide antibodies, peptide mapping, and
protein sequencing revealed that both mature and precursor forms of TGF-β were produced and were held
together, in part, by a complex array of disulfide bonds (Gentry, L.E., N.R. Webb, J. Lim, A.M. Brunner, J.E.
Ranchalis, D.R. Twardzik, M.N. Lioubin, H. Marquardt & A.F. Purchio (1987) Mol. Cell. Biol. 7:3418-3427;
Gentry, L.E., M.N. Lioubin, A.F. Purchio & H. Marquardt (1988) Mol. Cell. Biol. 8:4162-4168). Upon purification away
from the 24 kD mature rTGF-β1, the 90 to 110 kD precursor complex was found to consist of three species:
pro-TGF-β1, the pro-region of the TGF-β1 precursor, and mature TGF-β1 (Gentry, L.E., N.R. Webb, J. Lim,
A.M. Brunner, J.E. Ranchalis, D.R. Twardzik, M.N. Lioubin, H. Marquardt & A.F. Purchio (1987) Mol. Cell. Biol.
7:3418-3427; Gentry, L.E., M.N. Lioubin, A.F. Purchio & H. Marquardt (1988) Mol. Cell. Biol. 8:4162-4168).
Detection of optimal biological activity required acidification before analysis, indicating that rTGF-β1 was
secreted in a latent form.

The pro-region of the TGF-β1 precursor was found to be glycosylated at three sites (Asn 82, Asn 136, and
Asn 176) and the first two of these (Asn 82 and Asn 136) contain mannose-6-phosphate residues (Brunner,
A.M., L.E. Gentry, J.A. Cooper & A.F. Purchio (1988) Mol. Cell. Biol. 8:2229-2232; Purchio, A.F., J.A. Cooper,
A.M. Brunner, M.N. Lioubin, L.E. Gentry, K.S. Kovacina, R.A. Roth & H. Marquardt (1988) J. Biol. Chem.
263:14211-14215). In addition, the rTGF-β1 precursor is capable of binding to the mannose-6-phosphate
receptor, which may imply a mechanism for delivery to lysosomes where proteolytic processing can occur
(Kornfeld, S. (1986) J. Clin. Invest. 77:1-6).

TGF-β2 is also a 24-kD homodimer of identical disulfide-bonded 112 amino acid subunits (Marquardt, H.,
M.N. Lioubin & T. Ikeda (1987) J. Biol. Chem. 262:12127-12131). Analysis of cDNA clones coding for human
(Madisen, L., N.R. Webb, T.M. Rose, H. Marquardt, T. Ikeda, D. Twardzik, S. Seyedin & A.F. Purchio (1988)
DNA 7:1-8; DeMartin, R., B. Haendler, R. Hoefer-Warbinek, H. Gaugitsch, M. Wrann, H. Schlusener, J.M.
Seifert, S. Bodmer, A. Fontana & E. Hoefer, EMBO J. 6:3673-3677) and simian (Hanks, S.K., R. Armour, J.H.
Baldwin, F. Maldonado, J. Spiess & R.W. Holley (1988) Proc. Natl. Acad. Sci. USA 85:79-82) TGF-β2 showed
that it, too, is synthesized as a larger precursor protein. The mature regions of TGF-β1 and TGF-β2 show 70%
homology, whereas 30% homology occurs in the pro-region of the precursor. In the case of simian and human
TGF-β2, precursor proteins differing by a 28 amino acid insertion in the pro-region have been described; mRNA
coding for these two proteins is thought to arise via differential splicing (Webb, N.R., L. Madisen, T.M. Rose &
A.F. Purchio (1988) DNA 7:493-497).

SUMMARY OF THE INVENTION 

The present invention is directed to the induction in mammalian cells of a new family of genes in response
to TGF-beta administration. The induced genes encode a class of similar proteins containing about 345 to about
380 amino acid residues, having a molecular weight of about 37,000 daltons to about 45,000 daltons and
containing about 38 cysteine residues. The cysteine residues are substantially conserved and these proteins share
about 50% homology with each other. The induced gene products further share extensive homology with a
protein induced by v-src in chicken embryo fibroblasts.

The present invention specifically discloses the induction by TGF-beta in mouse embryo cells of a gene
family encoding proteins designated as βIG-M1 and βIG-M2 (beta-induced gene-mouse 1 and 2, respectively)
that share about 80% and 50% homology, respectively, with the CEF-10 protein induced by v-src in chicken
embryo fibroblasts. The nucleotide sequences for βIG-M1 and βIG-M2 were elucidated and compared. The
induction of the genes of the present invention by TGF-beta had not been previously reported or envisioned.

DESCRIPTION OF THE FIGURES

In the drawings: 

FIGURE 1 illustrates the nucleotide and deduced amino acid sequences of βIG-M1, and corresponds to
Sequence I.D. No. 1.

FIGURE 2 illustrates the nucleotide and deduced amino acid sequences of βIG-M2, and corresponds to
Sequence I.D. No. 3.

FIGURE 3 illustrates Northern blot analysis of βIG-M1 and βIG-M2 RNA. Total RNA was extracted from
AKR-2B cells (Purchio and Fareed (1979) J. Virol. 29:763-769), fractionated on a 1% agarose-formaldehyde
gel (Lehrach et al. (1977) Biochemistry 16:4743-4751) and hybridized to [32P]-labelled βIG-M1 (A) or βIG-M2
(C) probes. Lane 1, AKR-2B; Lane 2, AKR-2B and TGF-β1; Lane 3, AKR-2B and cycloheximide; Lane 4,
AKR-2B and cycloheximide and TGF-β1. The gels shown in panels A and C were stained with methylene blue and
photographed (B and D) to show equal loading of RNAs.

FIGURE 4 illustrates the alignment of amino acid residue sequences for βIG-M1 and CEF-10 proteins.
Residues that are identical in both sequences are indicated by (:).

FIGURE 5 illustrates the alignment of amino acid residue sequences for βIG-M2 and CEF-10 proteins.
Residues that are identical in both sequences are indicated by (:).

FIGURE 6 illustrates the alignment of amino acid residue sequences for βIG-M2 and βIG-M1 proteins.
Residues that are identical in both sequences are indicated by (:).

FIGURE 7 illustrates the multiple sequence alignment of region II of CS protein. The alignment shown is
between 8 protein sequences. An asterisk (*) indicates the positions where alignment is perfectly conserved,
and a dot (.) indicates those positions that are well conserved.
The aligned regions represented are:
. βIG-M1: amino acid residues 227-286 (60 residues)
. CEF12CS (CEF10): amino acid residues 224-283 (60 residues)
. βIG-M2: amino acid residues 198-257 (60 residues)
. PFALCIPACS (P. falciparum CS protein region II): amino acid residues 340-395 (55 residues)
. PROPERDCSR (Properdin): consensus of 6 repeats (60 residues)
. THROMBOCS (Thrombospondin): repeat region, amino acid residues 420-476 (56 residues)
. PFALTRAPCS (P. falciparum TRAP): amino acid residues 244-291 (48 residues)
. C7COMPCS (C7 terminal complement motif): amino acid residues 8-63 (56 residues)

FIGURE 8 illustrates a Southern blot analysis of mouse genomic DNA with pβIG-M2. High molecular weight
DNA was extracted from mouse kidneys, digested with Bam HI (lane 1), Eco RI (lane 2), Hind III (lane 3) or
Sst I (lane 4) and analyzed by Southern blotting with [32P]-labeled pβIG-M2 (panel A) or [32P]-labeled pβIG-M1
(panel B).

DESCRIPTION OF PREFERRED EMBODIMENTS 

The present invention is directed to the induction of a gene family by TGF-beta administration to target cells.
The genes encode a family of proteins having about 345 to about 380 amino acid residues, having a molecular
weight of about 37,000 daltons to about 45,000 daltons and containing about 38 cysteine residues.
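
The stated residue counts and molecular weights can be sanity-checked against each other with a rough estimate. The sketch below assumes an average residue mass of about 110 daltons, a common rule of thumb that is not taken from this disclosure; the constant and the function name `approx_mass_da` are illustrative assumptions, and the result ignores glycosylation or other modifications.

```python
# Assumption: ~110 Da average mass per amino acid residue (rule of thumb only).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_da(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Rough polypeptide mass in daltons from a residue count."""
    return n_residues * avg_residue_mass


for n in (345, 380):
    print(n, approx_mass_da(n))
```

With this approximation, 345 and 380 residues give roughly 38,000 and 42,000 daltons, which falls inside the 37,000 to 45,000 dalton range stated above.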

TGF-β1 is known to regulate the transcription of several genes, such as the genes encoding c-myc, c-sis,
the receptor for platelet derived growth factor (PDGF) and TGF-beta1. The proteins encoded by the TGF-beta1
induced genes are likely involved in mediation of the biological effects of TGF-beta1 relating to cell growth and
differentiation.

All amino acid residues identified herein are in the natural L-configuration. In keeping with standard
polypeptide nomenclature, abbreviations for amino acid residues are as follows:

AMINO ACID                       SYMBOL
                                 3-Letter    1-Letter
Alanine                          Ala         A
Arginine                         Arg         R
Asparagine                       Asn         N
Aspartic acid                    Asp         D
Aspartic acid or Asparagine      Asx         B
Cysteine                         Cys         C
Glutamine                        Gln         Q
Glutamic acid                    Glu         E
Glycine                          Gly         G
Glutamic acid or Glutamine       Glx         Z
Histidine                        His         H
Isoleucine                       Ile         I
Leucine                          Leu         L
Lysine                           Lys         K
Methionine                       Met         M
Phenylalanine                    Phe         F
Proline                          Pro         P
Serine                           Ser         S
Threonine                        Thr         T
Tryptophan                       Trp         W
Tyrosine                         Tyr         Y
Valine                           Val         V
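
The abbreviation table can be expressed as a lookup for converting between the three-letter notation used in the Sequence Listing and the one-letter notation. This is a minimal sketch; the names `THREE_TO_ONE` and `to_one_letter` are hypothetical and introduced only for illustration.

```python
# Three-letter to one-letter amino acid symbols, as listed in the table.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Asx": "B",
    "Cys": "C", "Gln": "Q", "Glu": "E", "Gly": "G", "Glx": "Z",
    "His": "H", "Ile": "I", "Leu": "L", "Lys": "K", "Met": "M",
    "Phe": "F", "Pro": "P", "Ser": "S", "Thr": "T", "Trp": "W",
    "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[token] for token in three_letter_seq.split())


print(to_one_letter("Gly Cys Gly Cys Cys Arg Val Cys"))  # GCGCCRVC
```

An unrecognized token raises `KeyError`, which is useful when the input comes from scanned text in which symbols such as Gln or Ile may have been misread.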



In the present invention it was found that when cells are treated with TGF-beta1, at least one new class of
genes was transcriptionally activated. This class of genes was established by isolating the RNA from the treated
cells, processing it, and then preparing cDNA from the RNA. The cDNA was further cloned and a library of genes
prepared.

As used herein, the term "library" refers to a large random collection of cloned DNA fragments obtained
from the transcription system of interest. The gene library was then screened with labelled cDNA probes
obtained from TGF-beta treated and untreated cells. This approach led to the detection of TGF-beta1 induced
genes.

In a preferred embodiment, mouse AKR-2B cells (obtained from Dr. H. Moses, Vanderbilt University,
Nashville, TN) were treated with TGF-beta1, and two new genes, designated βIG-M1 and βIG-M2,
respectively, were elucidated. The coding sequences for these genes were obtained by cDNA cloning of the
polyadenylated RNA isolated from the AKR-2B cells. The entire coding region was sequenced and then compared to
known published sequences. The deduced amino acid sequences of the βIG-M1 and βIG-M2 gene products
demonstrated about 80% and 50% homology, respectively, with CEF-10, a gene induced by v-src in chicken
embryo fibroblasts (Simmons et al. (1989) Proc. Natl. Acad. Sci. USA 86:1178). Comparison and alignment
of the amino acid sequences of CEF-10 with βIG-M1 and βIG-M2 are shown in FIGURES 4 and 5, respectively.
It is readily seen that significant homology exists between these proteins and that 38 of the 39 cysteine residues
are conserved. When βIG-M1 and βIG-M2 are compared with each other, approximately 50% homology is seen
between the two sequences (FIGURE 6).

Upon further investigation it was found that the C-terminal cysteine rich domain of CEF-10, βIG-M1, and
βIG-M2 contains an amino acid sequence motif with strong homology (9 of 12 amino acids) to a motif found near
the C-terminal of the malarial circumsporozoite (CS) protein (FIGURE 7). This region of the CS protein,
designated "region II", is highly conserved (10 of 12 amino acids) among all species of malarial parasites sequenced
to date (Robson, K.J.H., et al. (1988) Nature 335:79; Rich, K.A., et al. (1990) Science 249:1574). The CS protein
is expressed on the surface of Plasmodium species during the sporozoite phase and may be involved in
recognition and entry into hepatocytes (Aley, S.B., et al. (1986) J. Exp. Med. 164:1915).

The role of the region II motif in cell adhesion has been demonstrated by using peptide fragments of P. vivax
CS protein to promote T-cell and myeloid cell line attachment to microtiter plates (Rich, K.A., et al. (1990)
Science 249:1574). Furthermore, only peptides overlapping region II were able to inhibit T-cell and myeloid cell
lines from binding to the CS protein.

The region II CS protein motif (CS motif) is also found in other proteins which may have cell adhesive
properties that mediate cell-cell and cell-extracellular matrix interactions, such as properdin, thrombospondin,
thrombospondin related anonymous protein (TRAP) and various complement components.

Properdin has 6 repeats containing the CS motif. Properdin is involved in stabilizing the 'alternate' pathway
of complement which involves the binding of C3b to the surfaces of foreign organisms (Goundis, D. and Reid,
K.B.M. (1988) Nature 335:82).

Thrombospondin has 3 repeats of the CS motif. Data suggest it is a member of a class of adhesive proteins
secreted by activated platelets and tissue culture cells, associating with the platelet membrane and becoming
incorporated in fibrin clots and extracellular matrix (Lawler, J. and Hynes, R.O. (1986) J. Cell Biol. 103:1635).

TRAP is a surface antigen expressed during the blood stage of P. falciparum and may be involved in
attachment to erythrocytes (possibly via C3b) prior to invasion (Robson, K.J.H., et al. (1988) Nature 335:79).

A comparison of the amino acid residue sequences of these proteins is shown in FIGURE 7, and
demonstrates a high degree of conservation of the region II sequence.

The N-terminus and the C-terminus of complement components C7, C8α, and C8β, and the N-terminus of
C9 contain motifs that have weak homology to the CS motif (Goundis, D. and Reid, K.B.M. (1988) Nature
335:82).

Libraries of cDNA were generated in the present invention as a means to detect the induction of new genes
by TGF-beta1. Double stranded cDNA containing EcoRI cohesive termini was ligated into the unique EcoRI
cloning site present in λgt10 DNA. The recombinant DNA was then packaged into viable phage particles and plated
on appropriate hosts (E. coli strain C600 rK- mK+ hfl) for amplification and screening.

λgt10 is an insertion vector with a cloning capacity of up to 7 kb. The unique EcoRI cloning site is located
in the λ repressor (cI) gene. Insertion of foreign DNA at this restriction site interrupts the cI coding sequence
and causes the phenotype of the phage to change from cI+ (wild type) to cI-. Since cI- phage are unable to
lysogenize the host, clear plaques are produced by the recombinants. When plated on mutant bacteria which
produce lysogeny, or bacteriophage integration, at a high frequency, only recombinant cI- phage produce
plaques. Nonrecombinants, such as λgt10 without an insert, are effectively suppressed from plaque formation.
This has served in the present invention as the basis for the biological selection for recombinant phage during
λgt10 library amplification.

Selection of the cloned sequences of interest in the present invention was carried out by screening the
library with nucleic acid sequences derived from TGF-β1 treated and untreated cells. This screening is dependent
upon molecular hybridization by annealing of single-stranded nucleic acid molecules to form duplex structures
that are stabilized by sequence-specific hydrogen bonds. Only nucleic acids of related sequence organization
will base pair, or hybridize, with each other.

Northern blot analysis as carried out in the present invention allows the detection of rare RNA molecules
in a cell. In this technique, total cellular RNA is prepared and then resolved into different size classes
electrophoretically. The resolved RNA is then transferred and probed with radiolabelled DNA, followed by
radioautographic detection of DNA-RNA hybrid duplexes.

The Northern blot technology was used in the present invention to further characterize βIG-M1 and βIG-M2.

The present invention is further described by the following Examples which are intended to be illustrative
and not limiting.

EXAMPLE 1 

Isolation of βIG-M1 and βIG-M2

AKR-2B mouse cells (obtained from Dr. H. Moses, Vanderbilt University, Nashville, TN) were grown to
confluency in McCoy's media (GIBCO BRL, Gaithersburg, MD) plus 5% fetal bovine serum (FBS). The cells
were then treated with cycloheximide (10 µg/ml) for 15 minutes.

TGF-beta1 (10 ng/ml) was then added to the cells and the cells maintained for 6 hours at about 37°C with
cycloheximide and TGF-beta1.

The RNA was extracted from the cells. Polyadenylated RNA (polyA-RNA) was isolated by passage of the
extracted RNA through an oligo-dT cellulose column. The polyA-RNA was then used to prepare cDNA by use
of reverse transcriptase. The cDNA was cloned into λgt10 phage by using an EcoRI bridger according to the
method of Webb, N.R. et al., 1987, DNA 6:71-79.

A DNA library was prepared and was then screened using two 32P-labelled cDNA probes. The 32P-labelled
cDNA probes were prepared, respectively, from untreated AKR-2B mRNA and AKR-2B mRNA from cells
treated with cycloheximide and TGF-beta1. Hybridization of the probes with the DNA library to elicit plaques
was carried out. Those plaques that had hybridized strongly with the probe from treated cells were isolated
and further purified. The DNA from the tertiary plaques was cut with EcoRI and then cloned into plasmid
pEMBL18. Two clones (βIG-M1 and βIG-M2) were then sequenced. The sequences are shown in FIGURES 1
and 2 (Sequence I.D. Nos. 1 and 3, respectively).
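
To illustrate how an amino acid sequence is deduced from a sequenced clone, the following minimal Python sketch translates the first fourteen codons of the coding region as they appear in the Sequence Listing for SEQ ID NO:1. It assumes the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers and do not describe the software actually used by the inventors.

```python
# Standard genetic code laid out in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 14 codons of the SEQ ID NO:1 coding region, copied from the listing.
first_codons = "ATG AGC TCC AGC ACC TTC AGG ACG CTC GCT GTC GCC GTC ACC"
print(translate(first_codons))  # MSSSTFRTLAVAVT
```

The output matches the first fourteen residues printed beneath those codons in the listing (Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr).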

Northern blot analysis of the mRNA from treated and untreated cells is shown in FIGURE 3. βIG-M1
(FIGURE 3A, lane 2) and βIG-M2 (FIGURE 3C, lane 2) RNAs were significantly increased in AKR-2B cells after a 6
hour treatment with TGF-β1. These RNAs were barely detectable in untreated cells (FIGURES 3A and 3C, lane
1). Both βIG-M1 and βIG-M2 RNAs were increased by treatment with cycloheximide alone (FIGURES 3A and
3C, lane 3) and were even further induced by treatment with the combination of cycloheximide and TGF-β1
(FIGURES 3A and 3C, lane 4). TGF-β1 treatment in the presence of cycloheximide increased βIG-M2 RNA
to a much higher extent (15 fold) than βIG-M1 RNA (3 fold) over those values observed after cycloheximide
treatment alone.

Southern blot analysis was carried out using mouse kidney DNA and clearly demonstrated that the two
probes hybridized to different restriction fragments (FIGURE 8A and B), indicating that βIG-M1 and βIG-M2 are
encoded by different genes. It is readily seen that the administration of TGF-β1 in the presence of
cycloheximide significantly induces the production of mRNA for both βIG-M1 and βIG-M2 (FIGURE 3). A small
amount of constitutive synthesis of these mRNAs is seen in the cycloheximide treated cells.

EXAMPLE 2 

Characterization of βIG-M1 and βIG-M2

The amino acid residue sequences for βIG-M1 and βIG-M2 (Sequence I.D. Nos. 2 and 4, respectively) were
determined and compared. As shown in FIGURE 6, when the two protein sequences are aligned there is a 47.7%
homology between the sequences with conservation of 38 of the 39 cysteine residues.
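
A percentage of this kind can be computed from a pairwise alignment by counting identical positions. The sketch below is one possible convention, not necessarily the one used to obtain the 47.7% figure: it assumes pre-aligned strings of equal length with '-' for gaps and excludes gap columns from the denominator. The function names and the toy sequences are hypothetical and are not taken from the figures.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the non-gap columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def conserved_cysteines(aligned_a, aligned_b):
    """Count alignment columns in which both sequences carry a cysteine."""
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a == b == "C")


# Toy alignment for illustration only (not the FIGURE 6 alignment).
toy_a = "GCGCCKVCAK"
toy_b = "GCGCCRVC-K"
print(round(percent_identity(toy_a, toy_b), 1), conserved_cysteines(toy_a, toy_b))
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give somewhat different percentages, so the denominator should be stated alongside any reported value.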

Comparison of the protein sequences with the v-src-induced gene product CEF-10 (Sequence I.D. No. 6)
shows homology of about 80% with βIG-M1 (Sequence I.D. No. 2) as seen in FIGURE 4, and of about 50%
with βIG-M2 (Sequence I.D. No. 4) as seen in FIGURE 5.

DNA sequence analysis of pβIG-M1 indicated that it contained a single open reading frame coding for a
379 amino acid polypeptide. As stated above, this protein is about 80% homologous to CEF-10. It was further
determined that βIG-M1 protein is identical to the protein encoded by cyr61, as described in O'Brien et al. (1990)
Mol. Cell. Biol. 10:3569-3577, an immediate early response gene induced in quiescent BALB 3T3 cells by serum
treatment.
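
The 379 residue length can be cross-checked against the coding sequence (CDS) location 186..1322 given for SEQ ID NO:1 in the Sequence Listing. The following is a minimal sketch of that arithmetic; `codon_count` is a hypothetical helper name.

```python
def codon_count(cds_start, cds_end):
    """Number of codons in a CDS given 1-based inclusive nucleotide coordinates."""
    length = cds_end - cds_start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# CDS location for SEQ ID NO:1 as given in the Sequence Listing.
print(codon_count(186, 1322))  # 379
```

The span covers 1137 nucleotides, that is 379 codons, in agreement with the 379 amino acid polypeptide described above.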

DNA sequence analysis of pβIG-M2 (FIGURE 2) indicates a single open reading frame encoding a 348
amino acid protein. The amino terminal portion of βIG-M2 contains a hydrophobic stretch which could function
as a signal peptide. Beginning at amino acid residue 52 in FIGURE 2, βIG-M2 contains the sequence
Gly-Cys-Gly-Cys-Cys-Arg-Val-Cys which conforms to the Gly-Cys-Gly-Cys-Cys-X-X-Cys motif reported in the amino
half of insulin-like growth factor (IGF) binding proteins (Binkert et al. (1988) EMBO J. 8:2497-2502; Albiston
et al. (1990) Biochem. Biophys. Res. Commun. 16:892-897; Brinkman et al. (1988) EMBO J. 7:2417-2423).
This motif is also present in βIG-M1 at amino acid residues 49 - 56 in FIGURE 1.
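
The motif position in βIG-M1 can be reproduced with a simple pattern search. In the sketch below, the pattern `GCGCC..C` is the one-letter form of Gly-Cys-Gly-Cys-Cys-X-X-Cys, and the test string is residues 1 to 64 of SEQ ID NO:2 transcribed from the Sequence Listing into one-letter code. The names `IGFBP_LIKE_MOTIF`, `BIG_M1_N_TERM` and `find_motif` are illustrative assumptions.

```python
import re

# Gly-Cys-Gly-Cys-Cys-X-X-Cys, where X is any residue.
IGFBP_LIKE_MOTIF = re.compile(r"GCGCC..C")

# Residues 1-64 of SEQ ID NO:2 (beta-IG-M1), transcribed to one-letter code.
BIG_M1_N_TERM = (
    "MSSSTFRTLAVAVTLLHLTRLALSTCPAACHCPLEAPKCAPGVGLVRDGCGCCKVCAKQLNEDC"
)


def find_motif(protein):
    """Return (first, last, match) for each motif hit, using 1-based positions."""
    return [
        (m.start() + 1, m.end(), m.group())
        for m in IGFBP_LIKE_MOTIF.finditer(protein)
    ]


print(find_motif(BIG_M1_N_TERM))  # [(49, 56, 'GCGCCKVC')]
```

The single hit at residues 49 to 56 agrees with the position stated in the text.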

The foregoing description and Examples are intended as illustrative of the present invention, but not as
limiting. Numerous variations and modifications may be effected without departing from the true spirit and scope
of the present invention.



SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: BRISTOL-MYERS SQUIBB COMPANY 
345 Park Avenue 
New York, New York 10154 
United States of America 

(ii) TITLE OF INVENTION: TGF-BETA INDUCED GENE FAMILY 



(iii) NUMBER OF SEQUENCES: 6 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Joseph M. Sorrentino 

(B) STREET: 3005 First Avenue 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) ZIP: 98121 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.24

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US unassigned 

(B) FILING DATE: 18-JAN-1991 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Sorrentino, Joseph M.

(B) REGISTRATION NUMBER: 32,598

(C) REFERENCE/DOCKET NUMBER: ON0081-

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (206)728-4800 

(B) TELEFAX: (206)448-4775 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2028 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: N
(iv) ANTI-SENSE: N

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 

(G) CELL TYPE: Fibroblast 

(H) CELL LINE: AKR-2B

(viii) POSITION IN GENOME:

(C) UNITS: bp

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 186..1322

(D) OTHER INFORMATION:
(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 186..1322
(D) OTHER INFORMATION:



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GACCGTGAGC GAGAGGCCCA GAGAAGCGCC TGCAATCTCT GCGCTCCTCC GCCAGCACCT 

CGAGAGAAGG ACACCCGCCG CCTCGGCCCT CGCCTCACCG CACTCCGGGC GCATTTGATC 

CCGCTGCTCG CCGGCTTGTT GGTTCTGTGT CGCCGCGCTC GCCCCGGTTC CTCCTGCGCG 

CCACA ATG AGC TCC AGC ACC TTC AGG ACG CTC GCT GTC GCC GTC ACC
Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr
1 5 10

CTT CTC CAC TTG ACC AGA CTG GCG CTC TCC ACC TGC CCC GCC GCC TGC
Leu Leu His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala Cys
15 20 25 30

CAC TGC CCT CTG GAG GCA CCC AAG TGC GCC CCG GGA GTC GGG TTG GTC
His Cys Pro Leu Glu Ala Pro Lys Cys Ala Pro Gly Val Gly Leu Val
35 40 45

CGG GAC GGC TGC GGC TGC TGT AAG GTC TGC GCT AAA CAA CTC AAC GAG
Arg Asp Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu
50 55 60

GAC TGC AGC AAA ACT CAG CCC TGC GAC CAC ACC AAG GGG TTG GAA TGC
Asp Cys Ser Lys Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys
65 70 75

AAT TTC GGC GCC AGC TCC ACC GCT CTG AAA GGG ATC TGC AGA GCT CAG
Asn Phe Gly Ala Ser Ser Thr Ala Leu Lys Gly Ile Cys Arg Ala Gln
80 85 90

TCA GAA GGC AGA CCC TGT GAA TAT AAC TCC AGA ATC TAC CAA AAC GGG
Ser Glu Gly Arg Pro Cys Glu Tyr Asn Ser Arg Ile Tyr Gln Asn Gly
95 100 105 110

GAA AGC TTC CAG CCC AAC TGT AAA CAC CAG TGC ACA TGT ATT GAT GGC 563
Glu Ser Phe Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly
115 120 125

[nucleotide line illegible in source]
Ala Val Gly Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn
130 135 140

CTG GGC TGT CCC AAC CCC CGG CTG GTG AAA GTC AGC GGG CAG TGC TGT 659
Leu Gly Cys Pro Asn Pro Arg Leu Val Lys Val Ser Gly Gln Cys Cys
145 150 155

GAA GAG TGG GTT TGT GAT GAA GAC AGC ATT AAG GAC TCC CTG GAC GAC 707
Glu Glu Trp Val Cys Asp Glu Asp Ser Ile Lys Asp Ser Leu Asp Asp
160 165 170

CAG GAT GAC CTC CTC GGA CTC GAT GCC TCG GAG GTG GAG TTA ACG AGA 755
Gln Asp Asp Leu Leu Gly Leu Asp Ala Ser Glu Val Glu Leu Thr Arg
175 180 185 190

AAC AAT GAG TTA ATC GCA ATT GGA AAA GGC AGC TCA CTG AAG AGG CTT 803
Asn Asn Glu Leu Ile Ala Ile Gly Lys Gly Ser Ser Leu Lys Arg Leu
195 200 205

CCT GTC TTT GGC ACC GAA CCG CGA GTT CTT TTC AAC CCT CTG CAC GCC 851
Pro Val Phe Gly Thr Glu Pro Arg Val Leu Phe Asn Pro Leu His Ala
210 215 220

CAT GGC CAG AAA TGC ATC GTT CAG ACC ACG TCT TGG TCC CAG TGC TCC 899
His Gly Gln Lys Cys Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser
225 230 235

AAG AGC TGC GGA ACT GGC ATC TCC ACA CGA GTT ACC AAT GAC AAC CCA 947
Lys Ser Cys Gly Thr Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro
240 245 250

GAG TGC CGC CTG GTG AAA GAG ACC CGG ATC TGT GAA GTG CGT CCT TGT 995
Glu Cys Arg Leu Val Lys Glu Thr Arg Ile Cys Glu Val Arg Pro Cys
255 260 265 270

GGA CAA CCA GTG TAC AGC AGC CTA AAA AAG GGC AAG AAA TGC AGC AAG 1043
Gly Gln Pro Val Tyr Ser Ser Leu Lys Lys Gly Lys Lys Cys Ser Lys
275 280 285

ACC AAG AAA TCC CCA GAA CCA GTC AGA TTT ACT TAT GCA GGA TGC TCC 1091
Thr Lys Lys Ser Pro Glu Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser
290 295 300

AGT GTC AAG AAA TAC CGG CCC AAA TAC TGC GGC TCC TGC GTA GAT GGC 1139
Ser Val Lys Lys Tyr Arg Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly
305 310 315

CGG TGC TGC ACA CCT CTG CAG ACC AGA ACT GTG AAG ATG CGG TTC CGA 1187
Arg Cys Cys Thr Pro Leu Gln Thr Arg Thr Val Lys Met Arg Phe Arg
320 325 330

TGC GAA GAT GGA GAG ATG TTT TCC AAG AAT GTC ATG ATG ATC CAG TCC 1235
Cys Glu Asp Gly Glu Met Phe Ser Lys Asn Val Met Met Ile Gln Ser
335 340 345 350

TGC AAA TGT AAC TAC AAC TGC CCG CAT CCC AAC GAG GCA TCG TTC CGA 1283
Cys Lys Cys Asn Tyr Asn Cys Pro His Pro Asn Glu Ala Ser Phe Arg
355 360 365

CTG TAC AGC CTA TTC AAT GAC ATC CAC AAG TTC AGG GAC TAAGTGCCTC 1332
Leu Tyr Ser Leu Phe Asn Asp Ile His Lys Phe Arg Asp
370 375

CAGGGTTCCT ACTGTGGGCT GGACAGAGGA GAAGCGCAAG CATCATGGAG ACGTGGGTGG 1392 

GCGGAGGATG AATGGTGCCT TGCTCATTCT TGAGTAGCAT TAGGGTATTT CAAAACTGCC 1452 

AAGGGGCTGA TGTGGACGGA CAGCAGCGCA GCCGCAGTTG GAGAATGCCA AGGGGCTGAT 1512 

GTGGACGGAC AGCAGCGCAG CCGCAGTTGG AGAAGACTTC GCTTCATAGT ACTGGAGCGG 1572 

GCATTATTGC TCCATATTGG AGCATGTTTA CGGATGACGT TCTGTTTTCT GTTTGTAAAT 1632 

TATTTGCTAA GTGTATTTTT TTGCTCCAGA CCCCCCCCCC CCCTTTCTTG GTTCTACAAT 1692 

TGTAATAGAG ACAAAATAAG ATTAGTTGGG CCAAGTGAAA GCCCTGCTTG TCCTTTGACA 1752 

GAAGTAAATG AAAGCGCCTC TCATTCCTTC CCGAGCGGAG GGGGGACACT CTGTGAGTGT 1812 

CCTTGGGGCA GCTACCTGCA CTCTAAAACT GCAAACAGAA ACCAGGTGTT TTAAGATTGA 1872 

ATGTTTTTTT ATTTATCAAA GTGTAGCTTT TGGGGAGGGA GGGGAAATGT AATACTGGAA 1932 

TAATTTGTAA ATGATTTTAA TTTTATATCA GTGAAGAGAA TTTATTTATA AAATTAATCA 1992 

TTTAATAAAG AAATATTTAC CTAAAAAAAA AAAAAA 2028 
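
The interleaved codon and residue lines of SEQ ID NO: 1 can be cross-checked by translating the codons with the standard genetic code. The following is a minimal sketch for that purpose and is not part of the sequence listing; the helper name `translate` and the way the codon table is built are assumptions introduced here. It translates the last coding line of the listing, ending at the TAA stop codon.

```python
# Illustrative sketch only; `translate` is a hypothetical helper, not part of the listing.
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nucleotides: str) -> str:
    """Translate a DNA string (spaces ignored) to one-letter residues, stopping at a stop codon."""
    seq = nucleotides.replace(" ", "").upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon reached
            break
        residues.append(aa)
    return "".join(residues)


# Last coding line of SEQ ID NO: 1, followed by the TAA stop codon.
last_line = "CTG TAC AGC CTA TTC AAT GAC ATC CAC AAG TTC AGG GAC TAA"
print(translate(last_line))  # LYSLFNDIHKFRD
```

A mismatch between the translated string and the printed residue line points to a transcription error in one of the two lines.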



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 379 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr Leu Leu 
1 5 10 15

His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala Cys His Cys
20 25 30

Pro Leu Glu Ala Pro Lys Cys Ala Pro Gly Val Gly Leu Val Arg Asp
35 40 45

Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu Asp Cys
50 55 60

Ser Lys Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys Asn Phe
65 70 75 80

Gly Ala Ser Ser Thr Ala Leu Lys Gly Ile Cys Arg Ala Gln Ser Glu
85 90 95

Gly Arg Pro Cys Glu Tyr Asn Ser Arg Ile Tyr Gln Asn Gly Glu Ser
100 105 110

Phe Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly Ala Val
115 120 125

Gly Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly
130 135 140

Cys Pro Asn Pro Arg Leu Val Lys Val Ser Gly Gln Cys Cys Glu Glu
145 150 155 160

Trp Val Cys Asp Glu Asp Ser Ile Lys Asp Ser Leu Asp Asp Gln Asp
165 170 175

Asp Leu Leu Gly Leu Asp Ala Ser Glu Val Glu Leu Thr Arg Asn Asn
180 185 190

Glu Leu Ile Ala Ile Gly Lys Gly Ser Ser Leu Lys Arg Leu Pro Val
195 200 205

Phe Gly Thr Glu Pro Arg Val Leu Phe Asn Pro Leu His Ala His Gly
210 215 220

Gln Lys Cys Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys Ser
225 230 235 240

Cys Gly Thr Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro Glu Cys
245 250 255

Arg Leu Val Lys Glu Thr Arg Ile Cys Glu Val Arg Pro Cys Gly Gln
260 265 270

Pro Val Tyr Ser Ser Leu Lys Lys Gly Lys Lys Cys Ser Lys Thr Lys
275 280 285

Lys Ser Pro Glu Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser Ser Val
290 295 300

Lys Lys Tyr Arg Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly Arg Cys
305 310 315 320

Cys Thr Pro Leu Gln Thr Arg Thr Val Lys Met Arg Phe Arg Cys Glu
325 330 335

Asp Gly Glu Met Phe Ser Lys Asn Val Met Met Ile Gln Ser Cys Lys
340 345 350

Cys Asn Tyr Asn Cys Pro His Pro Asn Glu Ala Ser Phe Arg Leu Tyr
355 360 365

Ser Leu Phe Asn Asp Ile His Lys Phe Arg Asp
370 375



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2330 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: N 

(iv) ANTI-SENSE: N 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 

(G) CELL TYPE: Fibroblast 

(H) CELL LINE: AKR2B 

(viii) POSITION IN GENOME: 

(C) UNITS: bp 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 204..1247

(D) OTHER INFORMATION:
(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 204..1247
(D) OTHER INFORMATION: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

AGACTCAGCC AGATCCACTC CAGCTCCGAC CCCAGGAGAC CGACCTCCTC CAGACGGCAG 60 

CAGCCCCAGC CCAGCCGACA ACCCCAGACG CCACCGCCTG GAGCGTCCAG ACACCAACCT 120 

CCGCCCCTGT CCGAATCCAG GCTCCAGCCG CGCCTCTCGT CGCCTCTGCA CCCTGCTGTG 180

CATCCTCCTA CCGCGTCCCG ATC ATG CTC GCC TCC GTC GCA GGT CCC ATC 230 

Met Leu Ala Ser Val Ala Gly Pro Ile
1 5

AGC CTC GCC TTG GTG CTC CTC GCC CTC TGC ACC CGG CCT GCT ACG GGC 278
Ser Leu Ala Leu Val Leu Leu Ala Leu Cys Thr Arg Pro Ala Thr Gly
10 15 20 25

CAG GAC TGC AGC GCG CAA TGT CAG TGC GCA GCC GAA GCA GCG CCG CAC 326
Gln Asp Cys Ser Ala Gln Cys Gln Cys Ala Ala Glu Ala Ala Pro His
30 35 40

TGC CCC GCC GGC GTG AGC CTG GTG CTG GAC GGC TGC GGC TGC TGC CGC 374
Cys Pro Ala Gly Val Ser Leu Val Leu Asp Gly Cys Gly Cys Cys Arg
45 50 55

GTC TGC GCC AAG CAG CTG GGA GAA CTG TGT ACG GAG CGT GAC CCC TGC 422
Val Cys Ala Lys Gln Leu Gly Glu Leu Cys Thr Glu Arg Asp Pro Cys
60 65 70

GAC CCA CAC AAG GGC CTC TTC TGC GAT TTC GGC TCC CCC GCC AAC CGC 470
Asp Pro His Lys Gly Leu Phe Cys Asp Phe Gly Ser Pro Ala Asn Arg
75 80 85

AAG ATT GGA GTG TGC ACT GCC AAA GAT GGT GCA CCC TGT GTC TTC GGT 518
Lys Ile Gly Val Cys Thr Ala Lys Asp Gly Ala Pro Cys Val Phe Gly
90 95 100 105

GGG TCG GTG TAC CGC AGC GGT GAG TCC TTC CAA AGC AGC TGC AAA TAC 566
Gly Ser Val Tyr Arg Ser Gly Glu Ser Phe Gln Ser Ser Cys Lys Tyr
110 115 120

CAA TGC ACT TGC CTG GAT GGG GCC GTG GGC TGC GTG CCC CTA TGC AGC 614
Gln Cys Thr Cys Leu Asp Gly Ala Val Gly Cys Val Pro Leu Cys Ser
125 130 135

ATG GAC GTG CGC CTG CCC AGC CCT GAC TGC CCC TTC CCG AGA AGG GTC 662
Met Asp Val Arg Leu Pro Ser Pro Asp Cys Pro Phe Pro Arg Arg Val
140 145 150

AAG CTG CCT GGG AAA TGC TGC GAG GAG TGG GTG TGT GAC GAG CCC AAG 710
Lys Leu Pro Gly Lys Cys Cys Glu Glu Trp Val Cys Asp Glu Pro Lys
155 160 165

GAC CGC ACA GCA GTT GGC CCT GCC CTA GCT GCC TAC CGA CTG GAA GAC 758
Asp Arg Thr Ala Val Gly Pro Ala Leu Ala Ala Tyr Arg Leu Glu Asp
170 175 180 185

ACA TTT GGC CCA GAC CCA ACT ATG ATG CGA GCC AAC TGC CTG GTC CAG 806
Thr Phe Gly Pro Asp Pro Thr Met Met Arg Ala Asn Cys Leu Val Gln
190 195 200

ACC ACA GAG TGG AGC GCC TGT TCT AAG ACC TGT GGA ATG GGC ATC TCC 854
Thr Thr Glu Trp Ser Ala Cys Ser Lys Thr Cys Gly Met Gly Ile Ser
205 210 215

ACC CGA GTT ACC AAT GAC AAT ACC TTC TGC AGA CTG GAG AAG CAG AGC 902
Thr Arg Val Thr Asn Asp Asn Thr Phe Cys Arg Leu Glu Lys Gln Ser
220 225 230

CGC CTC TGC ATG GTC AGG CCC TGC GAA GCT GAC CTG GAG GAA AAC ATT 950
Arg Leu Cys Met Val Arg Pro Cys Glu Ala Asp Leu Glu Glu Asn Ile
235 240 245

AAG AAG GGC AAA AAG TGC ATC CGG ACA CCT AAA ATC GCC AAG CCT GTC 998
Lys Lys Gly Lys Lys Cys Ile Arg Thr Pro Lys Ile Ala Lys Pro Val
250 255 260 265

AAG TTT GAG CTT TCT GGC TGC ACC AGT GTG AAG ACA TAC AGG GCT AAG 1046
Lys Phe Glu Leu Ser Gly Cys Thr Ser Val Lys Thr Tyr Arg Ala Lys
270 275 280

TTC TGC GGG GTG TGC ACA GAC GGC CGC TGC TGC ACA CCG CAC AGA ACC 1094
Phe Cys Gly Val Cys Thr Asp Gly Arg Cys Cys Thr Pro His Arg Thr
285 290 295

ACC ACT CTG CCA GTG GAG TTC AAA TGC CCC GAT GGC GAG ATC ATG AAA 1142
Thr Thr Leu Pro Val Glu Phe Lys Cys Pro Asp Gly Glu Ile Met Lys
300 305 310

AAG AAT ATG ATG TTC ATC AAG ACC TGT GCC TGC CAT TAC AAC TGT CCT 1190
Lys Asn Met Met Phe Ile Lys Thr Cys Ala Cys His Tyr Asn Cys Pro
315 320 325

GGG GAC AAT GAC ATC TTT GAG TCC CTG TAC TAC AGG AAG ATG TAC GGA 1238
Gly Asp Asn Asp Ile Phe Glu Ser Leu Tyr Tyr Arg Lys Met Tyr Gly
330 335 340 345

GAC ATG GCG TAAAGCCAGG AAGTAAGGGA CACGAACTCA TTAGACTATA 1287
Asp Met Ala



ACTTGAACTG AGTTGCATCT CATTTTCTTC TGTAAAAACA ATTACAGTAG CACATTAATT 1347 

TAAATCTGTG TTTTTAACTA CCGTGGGAGG AACTATCCCA CCAAAGTGAG AACGTTATGT 1407

CATGGCCATA CAAGTAGTCT GTCAACCTCA GACACTGGTT TCGAGACAGT TTACACTTGA 1467 

CAGTTGTTCA TTAGCGCACA GTGCCAGAAC GCACACTGAG GTGAGTCTCC TGGAACAGTG 1527 

GAGATGCCAG GAGAAAGAAA GACAGGTACT AGCTGAGGTT ATTTTAAAAG CAGCAGTGTG 1587 

CCTACTTTTT GGAGTGTAAC CGGGGAGGGA AATTATAGCA TGCTTGCAGA CAGACCTGCT 1647

CTAGCGAGAG CTGAGCATGT GTCCTCCACT AGATGAGGCT GAGTCCAGCT GTTCTTTAAG 1707 

AACAGCAGTT TCAGCTCTGA CCATTCTGAT TCCAGTGACA CTTGTCAGGA GTCAGAGCCT 1767 

TGTCTGTTAG ACTGGACAGC TTGTGGCAAG TAAGTTTGCC TGTAACAAGC CAGATTTTTA 1827 

TTGATATTGT AAATATTGTG GATATATATA TATATATATA TATATTTGTA CAGTTATCTA 1887

AGTTAATTTA AAGTCATTTG TTTTTGTTTT AAGTGCTTTT GGGATTTTAA ACTGATAGCC 1947 

TCAAACTCCA AACACCATAG GTAGGACACG AAGCTTATCT GTGATTCAAA ACAAAGGAGA 2007 

TACTGCAGTG GGAATTGTGA CCTGAGTGAC TCTCTGTCAG AACAAACAAA TGCTGTGCAG 2067

GTGATAAAGC TATGTATTGG AAGTCAGATT TCTAGTAGGA AATGTGGTCA AATCCCTGTT 2127

GGTGAACAAA TGGCCTTTAT TAAGAAATGG CTGGCTCAGG GTAAGGTCCG ATTCCTACCA 2187

GGAAGTGCTT GCTGCTTCTT TGATTATGAC TGGTTTGGGG TGGGGGGCAG TTTATTTGTT 2247

GAGAGTGTGA CCAAAAGTTA CATGTTTGCA CTTTCTAGTT GAAAATAAAG TATATATATA 2307

TTTTTATATG AAAAAAAAAA AAA 2330 
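
As a consistency check on the feature table, the CDS location 204..1247 given for SEQ ID NO: 3 can be converted to a codon count and compared with the 348-residue length stated for SEQ ID NO: 4. The sketch below is illustrative only; the helper name `cds_codons` is hypothetical, and it assumes 1-based inclusive coordinates that exclude the stop codon.

```python
# Illustrative sketch; `cds_codons` is a hypothetical helper, not part of the listing.
def cds_codons(start: int, end: int) -> int:
    """Return the number of codons spanned by 1-based inclusive coordinates start..end."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3


# CDS of SEQ ID NO: 3 as given in the feature table: 204..1247
print(cds_codons(204, 1247))  # 348
```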



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Leu Ala Ser Val Ala Gly Pro Ile Ser Leu Ala Leu Val Leu Leu
1 5 10 15

Ala Leu Cys Thr Arg Pro Ala Thr Gly Gln Asp Cys Ser Ala Gln Cys
20 25 30

Gln Cys Ala Ala Glu Ala Ala Pro His Cys Pro Ala Gly Val Ser Leu
35 40 45

Val Leu Asp Gly Cys Gly Cys Cys Arg Val Cys Ala Lys Gln Leu Gly
50 55 60

Glu Leu Cys Thr Glu Arg Asp Pro Cys Asp Pro His Lys Gly Leu Phe
65 70 75 80

Cys Asp Phe Gly Ser Pro Ala Asn Arg Lys Ile Gly Val Cys Thr Ala
85 90 95

Lys Asp Gly Ala Pro Cys Val Phe Gly Gly Ser Val Tyr Arg Ser Gly
100 105 110

Glu Ser Phe Gln Ser Ser Cys Lys Tyr Gln Cys Thr Cys Leu Asp Gly
115 120 125

Ala Val Gly Cys Val Pro Leu Cys Ser Met Asp Val Arg Leu Pro Ser
130 135 140

Pro Asp Cys Pro Phe Pro Arg Arg Val Lys Leu Pro Gly Lys Cys Cys
145 150 155 160

Glu Glu Trp Val Cys Asp Glu Pro Lys Asp Arg Thr Ala Val Gly Pro
165 170 175

Ala Leu Ala Ala Tyr Arg Leu Glu Asp Thr Phe Gly Pro Asp Pro Thr
180 185 190

Met Met Arg Ala Asn Cys Leu Val Gln Thr Thr Glu Trp Ser Ala Cys
195 200 205

Ser Lys Thr Cys Gly Met Gly Ile Ser Thr Arg Val Thr Asn Asp Asn
210 215 220

Thr Phe Cys Arg Leu Glu Lys Gln Ser Arg Leu Cys Met Val Arg Pro
225 230 235 240

Cys Glu Ala Asp Leu Glu Glu Asn Ile Lys Lys Gly Lys Lys Cys Ile
245 250 255

Arg Thr Pro Lys Ile Ala Lys Pro Val Lys Phe Glu Leu Ser Gly Cys
260 265 270

Thr Ser Val Lys Thr Tyr Arg Ala Lys Phe Cys Gly Val Cys Thr Asp
275 280 285

Gly Arg Cys Cys Thr Pro His Arg Thr Thr Thr Leu Pro Val Glu Phe
290 295 300

Lys Cys Pro Asp Gly Glu Ile Met Lys Lys Asn Met Met Phe Ile Lys
305 310 315 320

Thr Cys Ala Cys His Tyr Asn Cys Pro Gly Asp Asn Asp Ile Phe Glu
325 330 335

Ser Leu Tyr Tyr Arg Lys Met Tyr Gly Asp Met Ala 
340 345 
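
The claims characterize the induced proteins by their residue count and cysteine content. A three-letter listing such as SEQ ID NO: 4 can be tallied along those lines as sketched below. This is an illustration only: the helper name `residue_counts` is hypothetical, and the input is just the first 32 residues of SEQ ID NO: 4 written with standard three-letter codes, not the full sequence.

```python
# Illustrative sketch; `residue_counts` is a hypothetical helper, not part of the listing.
from collections import Counter


def residue_counts(three_letter: str) -> Counter:
    """Count residues in a whitespace-separated three-letter amino acid string."""
    return Counter(three_letter.split())


# First 32 residues of SEQ ID NO: 4 (excerpt only, standard three-letter codes).
excerpt = (
    "Met Leu Ala Ser Val Ala Gly Pro Ile Ser Leu Ala Leu Val Leu Leu "
    "Ala Leu Cys Thr Arg Pro Ala Thr Gly Gln Asp Cys Ser Ala Gln Cys"
)
counts = residue_counts(excerpt)
print(sum(counts.values()), counts["Cys"])  # 32 3
```

Applying the same function to a complete listing gives the total length and the cysteine count to compare with the ranges recited in the claims.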



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1804 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: N 

(iv) ANTI-SENSE: N 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Gallus domesticus 

(G) CELL TYPE: Fibroblast 

(H) CELL LINE: CEF10

(viii) POSITION IN GENOME: 
(C) UNITS: bp 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 53..1177
(D) OTHER INFORMATION:

(ix) FEATURE:

(A) NAME/KEY: mat_peptide



(B) LOCATION: 119..1177



(D) OTHER INFORMATION: 
(ix) FEATURE: 

(A) NAME/KEY: sig_peptide

(B) LOCATION: 53..118
(D) OTHER INFORMATION:

(x) PUBLICATION INFORMATION:

(A) AUTHORS: Simmons, Daniel L
Levy, Daniel B
Yannoni, Yvonne
Erikson, R L
(B) TITLE: Identification of a phorbol ester-repressible v-src-inducible gene
(C) JOURNAL: Proc. Natl. Acad. Sci. U.S.A.
(D) VOLUME: 86
(F) PAGES: 1178-1182
(G) DATE: February-1989
(K) RELEVANT RESIDUES IN SEQ ID NO: 5: FROM 1 TO 1804



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CCCCCTTCGC GATCGCGTCT CGAGCTCCGC TCTCGCTCCG CGCCGCTAAG AC ATG 55 

Met 
-22 



GGC TCT GCG GGA GCT CGC CCC GCG CTG GCG GCC GCC CTG CTC TGC CTG 103
Gly Ser Ala Gly Ala Arg Pro Ala Leu Ala Ala Ala Leu Leu Cys Leu
-20 -15 -10

GCC CGC CTG GCT CTC GGC TCT CCG TGC CCC GCC GTC TGC CAG TGC CCG 151
Ala Arg Leu Ala Leu Gly Ser Pro Cys Pro Ala Val Cys Gln Cys Pro
-5 1 5 10

GCC GCC GCG CCG CAG TGC GCC CCG GGC GTG GGG CTG GTG CCG GAC GGC 199
Ala Ala Ala Pro Gln Cys Ala Pro Gly Val Gly Leu Val Pro Asp Gly
15 20 25

TGC GGC TGC TGC AAG GTC TGC GCC AAG CAG CTG AAC GAG GAC TGC AGC 247
Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu Asp Cys Ser
30 35 40

CGG ACG CAG CCC TGC GAC CAC ACC AAG GGG CTG GAG TGC AAC TTC GGC 295
Arg Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys Asn Phe Gly
45 50 55

GCC AGC CCC GCC GCC ACC AAC GGC ATC TGC AGA GCA CAG TCT GAG GGG 343
Ala Ser Pro Ala Ala Thr Asn Gly Ile Cys Arg Ala Gln Ser Glu Gly
60 65 70 75

AGA CCA TGC GAA TAC AAC TCC AAA ATC TAC CAG AAC GGC GAA AGC TTC 391
Arg Pro Cys Glu Tyr Asn Ser Lys Ile Tyr Gln Asn Gly Glu Ser Phe
80 85 90

CAG CCC AAC TGC AAG CAC CAG TGT ACG TGC ATA GAT GGA GCT GTG GGC 439
Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly Ala Val Gly
95 100 105

TGC ATC CCG CTC TGC CCG CAG GAG CTC TCC CTC CCC AAC CTG GGC TGC 487
Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly Cys
110 115 120

CCC AGC CCC AGG CTG GTC AAA GTG CCT GGG CAG TGC TGC GAG GAG TGG 535
Pro Ser Pro Arg Leu Val Lys Val Pro Gly Gln Cys Cys Glu Glu Trp
125 130 135

GTC TGC GAT GAG AGC AAG GAT GCG CTG GAG GAG CTG GAG GGC TTC TTC 583
Val Cys Asp Glu Ser Lys Asp Ala Leu Glu Glu Leu Glu Gly Phe Phe
140 145 150 155

AGC AAG GAG TTT GGT CTG GAC GCT TCT GAG GGC GAA CTG ACC CGG AAC 631
Ser Lys Glu Phe Gly Leu Asp Ala Ser Glu Gly Glu Leu Thr Arg Asn
160 165 170

AAC GAG CTC ATT GCC ATC GTG AAG GGA GGC CTG AAG ATG CTA CCT GTT 679
Asn Glu Leu Ile Ala Ile Val Lys Gly Gly Leu Lys Met Leu Pro Val
175 180 185

TTT GGA TCC GAG CCG CAA AGC CGA GCT TTT GAG AAT CCC AAA TGC ATT 727
Phe Gly Ser Glu Pro Gln Ser Arg Ala Phe Glu Asn Pro Lys Cys Ile
190 195 200

GTG CAA ACA ACT TCC TGG TCC CAG TGC TCA AAG ACG TGT GGG ACC GGC 775
Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys Thr Cys Gly Thr Gly
205 210 215

ATC TCC ACC AGA GTC ACC AAC GAC AAT CCC GAC TGC AAG CTC ATC AAA 823
Ile Ser Thr Arg Val Thr Asn Asp Asn Pro Asp Cys Lys Leu Ile Lys
220 225 230 235

GAG ACC AGG ATA TGC GAA GTG AGG CCG TGT GGC CAG CCC AGC TAC GCC 871
Glu Thr Arg Ile Cys Glu Val Arg Pro Cys Gly Gln Pro Ser Tyr Ala
240 245 250

TCC CTG AAG AAG GGA AAA AAA TGT ACC AAG ACT AAG AAG TCC CCA TCC 919
Ser Leu Lys Lys Gly Lys Lys Cys Thr Lys Thr Lys Lys Ser Pro Ser
255 260 265

CCT GTA AGG TTT ACT TAT GCT GGA TGC TCC AGT GTG AAG AAG TAC CGC 967
Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser Ser Val Lys Lys Tyr Arg
270 275 280

CCC AAG TAC TGT GGG TCT TGC GTG GAT GGC AGG TGC TGT ACT CCC CAG 1015
Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly Arg Cys Cys Thr Pro Gln
285 290 295

CAG ACC AGG ACT GTC AAG ATC CGT TTC CGC TGC GAT GAT GGA GAA ACC 1063
Gln Thr Arg Thr Val Lys Ile Arg Phe Arg Cys Asp Asp Gly Glu Thr
300 305 310 315

TTC ACC AAG AGT GTC ATG ATG ATC CAG TCC TGC CGC TGC AAC TAC AAC 1111
Phe Thr Lys Ser Val Met Met Ile Gln Ser Cys Arg Cys Asn Tyr Asn
320 325 330

TGT CCG CAT GCA AAC GAA GCT TAT CCC TTC TAC AGA CTG GTC AAT GAC 1159
Cys Pro His Ala Asn Glu Ala Tyr Pro Phe Tyr Arg Leu Val Asn Asp
335 340 345

ATC CAC AAA TTT AGG GAC TAAGTGGTAT TTGGGGTGGG ATGTTAAACA 1207
Ile His Lys Phe Arg Asp
350

GAATTCTGAA GTAACCAGCC ATGGAGAAAG GACCTCTGAT GGAAGTGGTG CCTTCCCCCA 1267 
TTTGAGGGCA ATATGAGATA TTACAGGAGT GCACTGTGCA ACTGGACACT AATGCGACAG 1327 
AGATTTAAGC ATACTTAAAC CTTCATAGTA CTGGAGCAAC CTTACTCCTT CTTTTTGGAG 1387

CACCTTTATC TTACACTGTT TTCTGTTTGT AAGTGATCTG ATGTTTTGTT CCGGTTATGA 1447
AAGCTCTTCC TCTCCCGTTC AGTTTAACAC TACGCTTTTC CCCTCCCCTC CATCTTCTCC 1507

CCTACTCTCC CAACCAAGTT GGAAGTTACA TTCCTTCCTG AGGTGGGCAC TTGTGGGGTG 1567

TTCACACTGG CAGCTATTAT GTACCAACTG TAGTTTAATG GCAAACAGAA ATCAGTTGTT 1627

TTAAAGCTGA GTATTTTATT TATCAAACTG TAGCTCTTTT GTTTTCTTTT TTTTTTTTTT 1687

TAACCCCTTC CAACCCCTGT AATACTGGAA TAAGTTGTAA ATGATTTTAA TTTTATATTC 1747



GATGAATTAA AAGAATTTAT TTATCGAATT AATCATTTAA TAAAGAAATA TTTACCT 1804 
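
SEQ ID NO: 5 numbers the signal peptide with negative residue numbers (-22 to -1) and the mature protein from 1 upward, with no residue 0. The sketch below maps such a residue number to the nucleotide positions of its codon. It is an illustration under stated assumptions: the CDS start of 53 and the 22-residue signal peptide (53..118) are taken from the feature table, and `codon_span` is a hypothetical name introduced here.

```python
# Illustrative sketch; `codon_span` is a hypothetical helper, not part of the listing.
from typing import Tuple


def codon_span(residue: int, cds_start: int = 53, signal_len: int = 22) -> Tuple[int, int]:
    """Map a residue number (-signal_len..-1, then 1 upward) to 1-based codon coordinates."""
    if residue == 0 or residue < -signal_len:
        raise ValueError("residue numbering runs -signal_len..-1, then 1 upward")
    # Zero-based index of the codon within the CDS; the numbering skips 0.
    index = residue + signal_len if residue < 0 else residue + signal_len - 1
    first = cds_start + 3 * index
    return first, first + 2


print(codon_span(-22))  # (53, 55): the initiating ATG
print(codon_span(1))    # (119, 121): first codon after the signal peptide
print(codon_span(353))  # (1175, 1177): last codon of the CDS
```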


(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 375 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Gly Ser Ala Gly Ala Arg Pro Ala Leu Ala Ala Ala Leu Leu Cys 
-22 -20 -15 -10

Leu Ala Arg Leu Ala Leu Gly Ser Pro Cys Pro Ala Val Cys Gln Cys
-5 1 5 10

Pro Ala Ala Ala Pro Gln Cys Ala Pro Gly Val Gly Leu Val Pro Asp
15 20 25

Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln Leu Asn Glu Asp Cys
30 35 40

Ser Arg Thr Gln Pro Cys Asp His Thr Lys Gly Leu Glu Cys Asn Phe
45 50 55

Gly Ala Ser Pro Ala Ala Thr Asn Gly Ile Cys Arg Ala Gln Ser Glu
60 65 70

Gly Arg Pro Cys Glu Tyr Asn Ser Lys Ile Tyr Gln Asn Gly Glu Ser
75 80 85 90

Phe Gln Pro Asn Cys Lys His Gln Cys Thr Cys Ile Asp Gly Ala Val
95 100 105

Gly Cys Ile Pro Leu Cys Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly
110 115 120

Cys Pro Ser Pro Arg Leu Val Lys Val Pro Gly Gln Cys Cys Glu Glu
125 130 135

Trp Val Cys Asp Glu Ser Lys Asp Ala Leu Glu Glu Leu Glu Gly Phe
140 145 150

Phe Ser Lys Glu Phe Gly Leu Asp Ala Ser Glu Gly Glu Leu Thr Arg
155 160 165 170

Asn Asn Glu Leu Ile Ala Ile Val Lys Gly Gly Leu Lys Met Leu Pro
175 180 185

Val Phe Gly Ser Glu Pro Gln Ser Arg Ala Phe Glu Asn Pro Lys Cys
190 195 200

Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys Thr Cys Gly Thr
205 210 215

Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro Asp Cys Lys Leu Ile
220 225 230

Lys Glu Thr Arg Ile Cys Glu Val Arg Pro Cys Gly Gln Pro Ser Tyr
235 240 245 250

Ala Ser Leu Lys Lys Gly Lys Lys Cys Thr Lys Thr Lys Lys Ser Pro
255 260 265

Ser Pro Val Arg Phe Thr Tyr Ala Gly Cys Ser Ser Val Lys Lys Tyr
270 275 280

Arg Pro Lys Tyr Cys Gly Ser Cys Val Asp Gly Arg Cys Cys Thr Pro
285 290 295

Gln Gln Thr Arg Thr Val Lys Ile Arg Phe Arg Cys Asp Asp Gly Glu
300 305 310

Thr Phe Thr Lys Ser Val Met Met Ile Gln Ser Cys Arg Cys Asn Tyr
315 320 325 330

Asn Cys Pro His Ala Asn Glu Ala Tyr Pro Phe Tyr Arg Leu Val Asn
335 340 345

Asp Ile His Lys Phe Arg Asp
350



Claims 

1. A substantially purified protein comprising about 345 to about 380 amino acid residues, having a molecular
weight of about 37,000 daltons to about 45,000 daltons and containing about 38 cysteine residues, said
protein being induced by TGF-beta administration to mammalian cells.

2. The protein according to Claim 1, wherein the protein has an amino acid residue sequence substantially
corresponding to the sequence depicted in FIGURE 1 designated as βIG-M1 and having Sequence I.D.
No. 2.

3. The protein according to Claim 1, wherein the protein has an amino acid residue sequence substantially
corresponding to the sequence depicted in FIGURE 2 designated as βIG-M2 and having Sequence I.D.
No. 4.

4. The protein according to Claim 2 encoded by a nucleotide sequence substantially corresponding to the
sequence of FIGURE 1 and having Sequence I.D. No. 1.

5. The protein according to Claim 3 encoded by a nucleotide sequence substantially corresponding to the
sequence of FIGURE 2 and having Sequence I.D. No. 3.

6. A nucleotide sequence encoding a TGF-beta induced protein substantially corresponding to the nucleotide
sequence depicted in FIGURE 1 and having Sequence I.D. No. 1.

7. A nucleotide sequence encoding a TGF-beta-induced protein substantially corresponding to the nucleotide
sequence depicted in FIGURE 2 and having Sequence I.D. No. 3.

8. A gene family induced by TGF-beta wherein the induced genes encode a protein comprising about 345
to about 380 amino acid residues, having a molecular weight of about 37,000 daltons to about 45,000 daltons
and containing about 38 cysteine residues.

9. The gene family according to Claim 8 wherein an induced gene encodes a protein having an amino acid
residue sequence substantially corresponding to the sequence depicted in FIGURE 1 and having Sequence
I.D. No. 2.

10. The gene family according to Claim 8 wherein an induced gene encodes a protein having an amino acid
residue sequence substantially corresponding to the sequence depicted in FIGURE 2 and having Sequence
I.D. No. 4.

11. The gene family according to Claim 8 wherein an induced gene has a nucleotide sequence substantially
corresponding to the sequence depicted in FIGURE 1 and having Sequence I.D. No. 1.

12. The gene family according to Claim 8 wherein an induced gene has a nucleotide sequence substantially
corresponding to the sequence depicted in FIGURE 2 and having Sequence I.D. No. 3.

13. A method for the determination of a TGF-β induced gene comprising the steps of:

(1) treating a mammalian cell with an effective amount of an inhibitor of mRNA translation for a time
period sufficient to inhibit protein synthesis;

(2) further treating said mammalian cell with an effective amount of TGF-β for a time period sufficient
to induce mRNA synthesis from TGF-β inducible genes;

(3) preparing a cDNA library from mRNA isolated from the cell treated according to steps (1) and (2);

(4) probing the cDNA library with cDNA isolated from the untreated mammalian cell of step (1);

(5) probing the cDNA library with cDNA isolated from the mammalian cell treated according to steps
(1) and (2);

(6) selecting a cDNA detected in step (4) but not in step (5); and

(7) sequencing the DNA selected in step (6).

14. A method for the production of a protein according to any one of claims 1 to 5 comprising the steps of:

(1) inserting a nucleic acid coding sequence encoding the protein into an expression vector;

(2) transforming or transfecting a mammalian cell with the expression vector;

(3) culturing the mammalian cell to express the protein; and

(4) isolating the protein.

βIG-M1 CONSENSUS 112790

6ACCfiT6A6C OAGAGGCCCA GAGAAGCGCC TGCAATCTCT GCGCTCCTCC 6CCAGCACCT 60 

CGA6AGAA6G ACACCC6CC6 CCTCGGCCCT CGCCTCACCG CACTCCGGGC 6CATTTGATC 120 

CCGCTGCTCG CCGGCHGn GGHCTGIGT CGCCGCGCTC GCCCCGGTTC CTCCT6C6CG 180 

CCACA ATG AGC TCC AGC ACC TTC AGG ACG CTC GCT GTC GCC GTC ACC 227
Met Ser Ser Ser Thr Phe Arg Thr Leu Ala Val Ala Val Thr
1 5 10

CTT CTC CAC TTG ACC AGA CTG GCG CTC TCC ACC TGC CCC GCC GCC 272
Leu Leu His Leu Thr Arg Leu Ala Leu Ser Thr Cys Pro Ala Ala
15 20 25

TGC CAC TGC CCT CTG GAG GCA CCC AAG TGC GCC CCG GGA GTC GGG 317
Cys His Cys Pro Leu Glu Ala Pro Lys Cys Ala Pro Gly Val Gly
30 35 40

TTG GTC CGG GAC GGC TGC GGC TGC TGT AAG GTC TGC GCT AAA CAA 362
Leu Val Arg Asp Gly Cys Gly Cys Cys Lys Val Cys Ala Lys Gln
45 50 55

CTC AAC GAG GAC TGC AGC AAA ACT CAG CCC TGC GAC CAC ACC AAG 407
Leu Asn Glu Asp Cys Ser Lys Thr Gln Pro Cys Asp His Thr Lys
60 65 70

GGG TTG GAA TGC AAT TTC GGC GCC AGC TCC ACC GCT CTG AAA GGG 452
Gly Leu Glu Cys Asn Phe Gly Ala Ser Ser Thr Ala Leu Lys Gly
75 80 85

ATC TGC AGA GCT CAG TCA GAA GGC AGA CCC TGT GAA TAT AAC TCC 497
Ile Cys Arg Ala Gln Ser Glu Gly Arg Pro Cys Glu Tyr Asn Ser
90 95 100

AGA ATC TAC CAA AAC GGG GAA AGC TTC CAG CCC AAC TGT AAA CAC 542
Arg Ile Tyr Gln Asn Gly Glu Ser Phe Gln Pro Asn Cys Lys His
105 110 115

CAG TGC ACA TGT ATT GAT GGC GCC GTG GGC TGC ATT CCT CTG TGT 587
Gln Cys Thr Cys Ile Asp Gly Ala Val Gly Cys Ile Pro Leu Cys
120 125 130



FIGURE 1

[...] AAC CCC CGG 632
Pro Gln Glu Leu Ser Leu Pro Asn Leu Gly Cys Pro Asn Pro Arg
135 140 145

[...] GAG TGG GTT TGT GAT 677
Leu Val Lys Val Ser Gly Gln Cys Cys Glu Glu Trp Val Cys Asp
150 155 160



GAA GAC AGC ATT AAG GAC TCC CTG GAC GAC CAG GAT GAC CTC CTC 722
Glu Asp Ser Ile Lys Asp Ser Leu Asp Asp Gln Asp Asp Leu Leu
165 170 175

GGA CTC GAT GCC TCG GAG GTG GAG TTA ACG AGA AAC AAT GAG TTA 767
Gly Leu Asp Ala Ser Glu Val Glu Leu Thr Arg Asn Asn Glu Leu
180 185 190

ATC GCA ATT GGA AAA GGC AGC TCA CTG AAG AGG CTT CCT GTC TTT 812
Ile Ala Ile Gly Lys Gly Ser Ser Leu Lys Arg Leu Pro Val Phe
195 200 205

GGC ACC GAA CCG CGA GTT CTT TTC AAC CCT CTG CAC GCC CAT GGC 857
Gly Thr Glu Pro Arg Val Leu Phe Asn Pro Leu His Ala His Gly
210 215 220

CAG AAA TGC ATC GTT CAG ACC ACG TCT TGG TCC CAG TGC TCC AAG 902
Gln Lys Cys Ile Val Gln Thr Thr Ser Trp Ser Gln Cys Ser Lys
225 230 235

AGC TGC GGA ACT GGC ATC TCC ACA CGA GTT ACC AAT GAC AAC CCA 947
Ser Cys Gly Thr Gly Ile Ser Thr Arg Val Thr Asn Asp Asn Pro
240 245 250

GAG TGC CGC CTG GTG AAA GAG ACC CGG ATC TGT GAA GTG CGT CCT 992
Glu Cys Arg Leu Val Lys Glu Thr Arg Ile Cys Glu Val Arg Pro
255 260 265

TGT GGA CAA CCA GTG TAC AGC AGC CTA AAA AAG GGC AAG AAA TGC 1037
Cys Gly Gln Pro Val Tyr Ser Ser Leu Lys Lys Gly Lys Lys Cys
270 275 280

AGC AAG ACC AAG AAA TCC CCA GAA CCA GTC AGA TTT ACT TAT GCA 1082
Ser Lys Thr Lys Lys Ser Pro Glu Pro Val Arg Phe Thr Tyr Ala
285 290 295



FIGURE 1 (cont.)

GGA TGC TCC AGT GTC AAG AAA TAC CGG CCC AAA TAC TGC GGC TCC 1127
Gly Cys Ser Ser Val Lys Lys Tyr Arg Pro Lys Tyr Cys Gly Ser
300 305 310

TGC GTA GAT GGC CGG TGC TGC ACA CCT CTG CAG ACC AGA ACT GTG 1172
Cys Val Asp Gly Arg Cys Cys Thr Pro Leu Gln Thr Arg Thr Val
315 320 325

AAG ATG CGG TTC CGA TGC GAA GAT GGA GAG ATG TTT TCC AAG AAT 1217
Lys Met Arg Phe Arg Cys Glu Asp Gly Glu Met Phe Ser Lys Asn
330 335 340

GTC ATG ATG ATC CAG TCC TGC AAA TGT AAC TAC AAC TGC CCG CAT 1262
Val Met Met Ile Gln Ser Cys Lys Cys Asn Tyr Asn Cys Pro His
345 350 355

CCC AAC GAG GCA TCG TTC CGA CTG TAC AGC CTA TTC AAT GAC ATC 1307
Pro Asn Glu Ala Ser Phe Arg Leu Tyr Ser Leu Phe Asn Asp Ile
360 365 370

CAC AAG TTC AGG GAC TAAGTGCCTC CAGGGTTCCT ACTGTGGGCT GGACAGAGGA 1362
His Lys Phe Arg Asp
375

GAAGCGCAAG CATCATGGAG ACGTGGGTGG GCGGAGGATG AATGGTGCCT TGCTCATTCT 1422
TGAGTAGCAT TAGGGTATTT CAAAACTGCC AAGGGGCTGA TGTGGACGGA CAGCAGCGCA 1482
GCCGCAGTTG GAGAATGCCA AGGGGCTGAT GTGGACGGAC AGCAGCGCAG CCGCAGTTGG 1542
AGAAGACTTC GCTTCATAGT ACTGGAGCGG GCATTATTGC TCCATATTGG AGCATGTTTA 1602
CGGATGACGT TCTGTTTTCT GTTTGTAAAT TATTTGCTAA GTGTATTTTT TTGCTCCAGA 1662
CCCCCCCCCC CCCTTTCTTG GTTCTACAAT TGTAATAGAG ACAAAATAAG ATTAGTTGGG 1722
CCAAGTGAAA GCCCTGCTTG TCCTTTGACA GAAGTAAATG AAAGCGCCTC TCATTCCTTC 1782
CCGAGCGGAG GGGGGACACT CTGTGAGTGT CCTTGGGGCA GCTACCTGCA CTCTAAAACT 1842
GCAAACAGAA ACCAGGTGTT TTAAGATTGA ATGTTTTTTT ATTTATCAAA GTGTAGCTTT 1902
TGGGGAGGGA GGGGAAATGT AATACTGGAA TAATTTGTAA ATGATTTTAA TTTTATATCA 1962
GTGAAGAGAA TTTATTTATA AAATTAATCA TTTAATAAAG AAATATTTAC CTAAAAAAAA 2022
AAAAAA 2028

FIGURE 1 (cont.)



EP 0 495 674 A2 



βIG-M2 CONSENSUS 112790

AGACTCAGCC AGATCCACTC CAGCTCCGAC CCCAGGAGAC CGACCTCCTC CAGACGGCAG 60

CAGCCCCAGC CCAGCCGACA ACCCCAGACG CCACCGCCTG GAGCGTCCAG ACACCAACCT 120

CCGCCCCTGT CCGAATCCAG GCTCCAGCCG CGCCTCTCGT CGCCTCTGCA CCCTGCTGTG 180

CATCCTCCTA CCGCGTCCCG ATC ATG CTC GCC TCC GTC GCA GGT CCC 227
Met Leu Ala Ser Val Ala Gly Pro
1 5

ATC AGC CTC GCC TTG GTG CTC CTC GCC CTC TGC ACC CGG CCT GCT 272
Ile Ser Leu Ala Leu Val Leu Leu Ala Leu Cys Thr Arg Pro Ala
10 15 20

ACG GGC CAG GAC TGC AGC GCG CAA TGT CAG TGC GCA GCC GAA GCA 317
Thr Gly Gln Asp Cys Ser Ala Gln Cys Gln Cys Ala Ala Glu Ala
25 30 35

GCG CCG CAC TGC CCC GCC GGC GTG AGC CTG GTG CTG GAC GGC TGC 362
Ala Pro His Cys Pro Ala Gly Val Ser Leu Val Leu Asp Gly Cys
40 45 50

GGC TGC TGC CGC GTC TGC GCC AAG CAG CTG GGA GAA CTG TGT ACG 407
Gly Cys Cys Arg Val Cys Ala Lys Gln Leu Gly Glu Leu Cys Thr
55 60 65

GAG CGT GAC CCC TGC GAC CCA CAC AAG GGC CTC TTC TGC GAT TTC 452
Glu Arg Asp Pro Cys Asp Pro His Lys Gly Leu Phe Cys Asp Phe
70 75 80

GGC TCC CCC GCC AAC CGC AAG ATT GGA GTG TGC ACT GCC AAA GAT 497
Gly Ser Pro Ala Asn Arg Lys Ile Gly Val Cys Thr Ala Lys Asp
85 90 95

GGT GCA CCC TGT GTC TTC GGT GGG TCG GTG TAC CGC AGC GGT GAG 542
Gly Ala Pro Cys Val Phe Gly Gly Ser Val Tyr Arg Ser Gly Glu
100 105 110

TCC TTC CAA AGC AGC TGC AAA TAC CAA TGC ACT TGC CTG GAT GGG 587
Ser Phe Gln Ser Ser Cys Lys Tyr Gln Cys Thr Cys Leu Asp Gly
115 120 125

GCC GTG GGC TGC GTG CCC CTA TGC AGC ATG GAC GTG CGC CTG CCC 632
Ala Val Gly Cys Val Pro Leu Cys Ser Met Asp Val Arg Leu Pro
130 135 140

FIGURE 2

AGC CCT GAC TGC CCC TTC CCG AGA AGG GTC AAG CTG CCT GGG AAA 677
Ser Pro Asp Cys Pro Phe Pro Arg Arg Val Lys Leu Pro Gly Lys
145 150 155



TGC TGC GAG GAG TGG GTG TGT GAC GAG CCC AAG GAC CGC ACA GCA 722
Cys Cys Glu Glu Trp Val Cys Asp Glu Pro Lys Asp Arg Thr Ala
160 165 170

GTT GGC CCT GCC CTA GCT GCC TAC CGA CTG GAA GAC ACA TTT GGC 767
Val Gly Pro Ala Leu Ala Ala Tyr Arg Leu Glu Asp Thr Phe Gly
175 180 185

CCA GAC CCA ACT ATG ATG CGA GCC AAC TGC CTG GTC CAG ACC ACA 812
Pro Asp Pro Thr Met Met Arg Ala Asn Cys Leu Val Gln Thr Thr
190 195 200

GAG TGG AGC GCC TGT TCT AAG ACC TGT GGA ATG GGC ATC TCC ACC 857
Glu Trp Ser Ala Cys Ser Lys Thr Cys Gly Met Gly Ile Ser Thr
205 210 215

CGA GTT ACC AAT GAC AAT ACC TTC TGC AGA CTG GAG AAG CAG AGC 902
Arg Val Thr Asn Asp Asn Thr Phe Cys Arg Leu Glu Lys Gln Ser
220 225 230

CGC CTC TGC ATG GTC AGG CCC TGC GAA GCT GAC CTG GAG GAA AAC 947
Arg Leu Cys Met Val Arg Pro Cys Glu Ala Asp Leu Glu Glu Asn
235 240 245

ATT AAG AAG GGC AAA AAG TGC ATC CGG ACA CCT AAA ATC GCC AAG 992
Ile Lys Lys Gly Lys Lys Cys Ile Arg Thr Pro Lys Ile Ala Lys
250 255 260

CCT GTC AAG TTT GAG CTT TCT GGC TGC ACC AGT GTG AAG ACA TAC 1037
Pro Val Lys Phe Glu Leu Ser Gly Cys Thr Ser Val Lys Thr Tyr
265 270 275

AGG GCT AAG TTC TGC GGG GTG TGC ACA GAC GGC CGC TGC TGC ACA 1082
Arg Ala Lys Phe Cys Gly Val Cys Thr Asp Gly Arg Cys Cys Thr
280 285 290

CCG CAC AGA ACC ACC ACT CTG CCA GTG GAG TTC AAA TGC CCC GAT 1127
Pro His Arg Thr Thr Thr Leu Pro Val Glu Phe Lys Cys Pro Asp
295 300 305

GGC GAG ATC ATG AAA AAG AAT ATG ATG TTC ATC AAG ACC TGT GCC 1172
Gly Glu Ile Met Lys Lys Asn Met Met Phe Ile Lys Thr Cys Ala
310 315 320

FIGURE 2 (cont.)

TGC CAT TAC AAC TGT CCT GGG GAC AAT GAC ATC TTT GAG TCC CTG 1217
Cys His Tyr Asn Cys Pro Gly Asp Asn Asp Ile Phe Glu Ser Leu
325 330 335

TAC TAC AGG AAG ATG TAC GGA GAC ATG GCG TAAAGCCAGG AAGTAAGGGA 1267
Tyr Tyr Arg Lys Met Tyr Gly Asp Met Ala
340 345

CACGAACTCA TTAGACTATA ACTTGAACTG AGTTGCATCT CATTTTCTTC TGTAAAAACA 1327
ATTACAGTAG CACATTAATT TAAATCTGTG TTTTTAACTA CCGTGGGAGG AACTATCCCA 1387
CCAAAGTGAG AACGTTATGT CATGGCCATA CAAGTAGTCT GTCAACCTCA GACACTGGTT 1447
TCGAGACAGT TTACACTTGA CAGTTGTTCA TTAGCGCACA GTGCCAGAAC GCACACTGAG 1507
GTGAGTCTCC TGGAACAGTG GAGATGCCAG GAGAAAGAAA GACAGGTACT AGCTGAGGTT 1567
ATTTTAAAAG CAGCAGTGTG CCTACTTTTT GGAGTGTAAC CGGGGAGGGA AATTATAGCA 1627
TGCTTGCAGA CAGACCTGCT CTAGCGAGAG CTGAGCATGT GTCCTCCACT AGATGAGGCT 1687
GAGTCCAGCT GTTCTTTAAG AACAGCAGTT TCAGCTCTGA CCATTCTGAT TCCAGTGACA 1747
CTTGTCAGGA GTCAGAGCCT TGTCTGTTAG ACTGGACAGC TTGTGGCAAG TAAGTTTGCC 1807
TGTAACAAGC CAGATTTTTA TTGATATTGT AAATATTGTG GATATATATA TATATATATA 1867
TATATTTGTA CAGTTATCTA AGTTAATTTA AAGTCATTTG TTTTTGTTTT AAGTGCTTTT 1927
GGGATTTTAA ACTGATAGCC TCAAACTCCA AACACCATAG GTAGGACACG AAGCTTATCT 1987
GTGATTCAAA ACAAAGGAGA TACTGCAGTG GGAATTGTGA CCTGAGTGAC TCTCTGTCAG 2047
AACAAACAAA TGCTGTGCAG GTGATAAAGC TATGTATTGG AAGTCAGATT TCTAGTAGGA 2107
AATGTGGTCA AATCCCTGTT GGTGAACAAA TGGCCTTTAT TAAGAAATGG CTGGCTCAGG 2167
GTAAGGTCCG ATTCCTACCA GGAAGTGCTT GCTGCTTCTT TGATTATGAC TGGTTTGGGG 2227
TGGGGGGCAG TTTATTTGTT GAGAGTGTGA CCAAAAGTTA CATGTTTGCA CTTTCTAGTT 2287
GAAAATAAAG TATATATATA nTTTATATG AAAAAAAAAA AAA 2330 
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CEFIO . MGSAGARP-ALAAALLCURLAL6SPCPAVCQCPAAAPQCAPGVGLVP0G -49 

PIG-Ml - MSSSTFRTLAVAVTLLHLTRLAL-STCPAACHCPLEAPKCAPGVGLVRDG -49 

CEF10  - CGCCKVCAKQLNEDCSRTQPCDHTKGLECNFGASPAATNGICRAQSEGRP -99

βIG-M1 - CGCCKVCAKQLNEDCSKTQPCDHTKGLECNFGASSTALKGICRAQSEGRP -99

CEF10  - CEYNSKIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPSPR -149

βIG-M1 - CEYNSRIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPNPR -149

CEF10  - LVKVPGQCCEEWVCDES--KDALEELEGFFSKEFGLDASEGELTRNNELI -197

βIG-M1 - LVKVSGQCCEEWVCDEDSIKDSLDDQDDLL----GLDASEVELTRNNELI -195

CEF10  - AIVKGG-LKMLPVFGSEPQSRAFENP------KCIVQTTSWSQCSKTCGT -240

βIG-M1 - AIGKGSSLKRLPVFGTEP--RVLFNPLHAHGQKCIVQTTSWSQCSKSCGT -243

CEF10  - GISTRVTNDNPDCKLIKETRICEVRPCGQPSYASLKKGKKCTKTKKSPSP -290

βIG-M1 - GISTRVTNDNPECRLVKETRICEVRPCGQPVYSSLKKGKKCSKTKKSPEP -293

CEF10  - VRFTYAGCSSVKKYRPKYCGSCVDGRCCTPQQTRTVKIRFRCDDGETFTK -340

βIG-M1 - VRFTYAGCSSVKKYRPKYCGSCVDGRCCTPLQTRTVKMRFRCEDGEMFSK -343

CEF10  - SVMMIQSCRCNYNCPHANEA-YPFYRLVNDIHKFRD -375

βIG-M1 - NVMMIQSCKCNYNCPHPNEASFRLYSLFNDIHKFRD -379

CEF10  - MGSAGARP-ALAAALLCL-ARLALGSPCPAVCQCPA-AAPQCAPGVGLVP -47

βIG-M2 - MLASVAGPISLALVLLALCTRPATGQDCSAQCQCAAE[...] -50

CEF10  - DGCGCCKVCAKQLNEDCSRTQPCDHTKGLECNFGASPAATNGICRAQSEG -97

βIG-M2 - DGCGCCRVCAKQLGELCTERDPCDPHKGLFCDFGSPANRKIGVCTAK-DG -99

CEF10  - RPCEYNSKIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPS -147

βIG-M2 - APCVFGGSVYRSGESFQSSCKYQCTCLDGAVGCVPLCSMDVRLPSPDCPF -149

CEF10  - PRLVKVPGQCCEEWVCDESKDALEELEGFFSKEFGLDASEGELTRNNELI -197

βIG-M2 - PRRVKLPGKCCEEWVCDEPKDR-----------------------TAVGP -176

CEF10  - AIVKGGLKMLPVFGSEPQSRAFENPKCIVQTTSWSQCSKTCGTGISTRVT -247

βIG-M2 - ALAAYRLE--DTFGPDPTM---MRANCLVQTTEWSACSKTCGMGISTRVT -221

CEF10  - NDNPDCKLIKETRICEVRPCGQPSYASLKKGKKCTKTKKSPSPVRFTYAG -297

βIG-M2 - NDNTFCRLEKQSRLCMVRPCEADLEENIKKGKKCIRTPKIAKPVKFELSG -271

CEF10  - CSSVKKYRPKYCGSCVDGRCCTPQQTRTVKIRFRCDDGETFTKSVMMIQS -347

βIG-M2 - CTSVKTYRAKFCGVCTDGRCCTPHRTTTLPVEFKCPDGEIMKKNMMFIKT -321

CEF10  - CRCNYNCPHANEAYP-FYRLVNDIHKFRD -375

βIG-M2 - CACHYNCPGDNDIFESLYYRKMYG--DMA -348

βIG-M1 - MSSSTFRTLAVAVTLLHL-TRLALSTCPAACHCPLEA-PKCAPGVGLVR -47

βIG-M2 - MLASVAGPISLALVLLALCTRPATGQDCSAQCQCAAE[...] -50

βIG-M1 - DGCGCCKVCAKQLNEDCSKTQPCDHTKGLECNFGASSTALKGICRAQSEG -97

βIG-M2 - DGCGCCRVCAKQLGELCTERDPCDPHKGLFCDFGSPANRKIGVCTAK-DG -99

βIG-M1 - RPCEYNSRIYQNGESFQPNCKHQCTCIDGAVGCIPLCPQELSLPNLGCPN -147

βIG-M2 - APCVFGGSVYRSGESFQSSCKYQCTCLDGAVGCVPLCSMDVRLPSPDCPF -149

βIG-M1 - PRLVKVSGQCCEEWVCDEDSIKDSLDDQDDLLGLDASEVELTRNNELIAI -197

βIG-M2 - PRRVKLPGKCCEEWVCDEPKDRTAVG-------------PALAAYRLEDT -186

βIG-M1 - GKGSSLKRLPVFGTEPRVLFNPLHAHGQKCIVQTTSWSQCSKSCGTGIST -247

βIG-M2 - -----------FGPDP-----TMMRAN--CLVQTTEWSACSKTCGMGIST -218

βIG-M1 - RVTNDNPECRLVKETRICEVRPCGQPVYSSLKKGKKCSKTKKSPEPVRFT -297

βIG-M2 - RVTNDNTFCRLEKQSRLCMVRPCEADLEENIKKGKKCIRTPKIAKPVKFE -268

βIG-M1 - YAGCSSVKKYRPKYCGSCVDGRCCTPLQTRTVKMRFRCEDGEMFSKNVMM -347

βIG-M2 - LSGCTSVKTYRAKFCGVCTDGRCCTPHRTTTLPVEFKCPDGEIMKKNMMF -318

βIG-M1 - IQSCKCNYNCPHPNEASFRLYSLFNDIHKFRD -379

βIG-M2 - IKTCACHYNCPGDNDIFESLY--YRKMYGDMA -348

CIVQTTSWSQCSKSCGTGISTRVT NDNPECRL-VKETRICEVR 42

CIVQTTSWSQCSKTCGTGISTRVT NDNPDCKL-IKETRICEVR 42

CLVQTTEWSACSKTCGMGISTRVT NDNTFCRL-EKQSRLCMVR 42

NSI-STEWSPCSVTCGNGIQVRIKPGSANKPKDELDYEN-DIEKKICKME 48 

WSX-WSPWSPCSVTCSXGXQXXXRXRXCXXPAPXX-GXPCAGXAXXXXXQ 48 

WSH-WSPWSSCSVTCGDGV— ITRIRLCNSPSPQHNGKPC—ECEARETK 45 

CGV-WOEWSPCSVTCGKGTRSRKREILHEG CTSEIQEQ— 37 

WDF-YAPWSECN-GCTKTQTRRRSVAVYG QYGGQPCVG—NAFETQ 42 

region II of CS protein 